-
Adversarial attacks can be conducted against supervised learning algorithms, which necessitates the application of logging while using supervised learning algorithms in software projects. Logging enables practitioners to conduct postmortem analysis, which can be helpful to diagnose any conducted attacks. We conduct an empirical study to identify and characterize log-related coding patterns, i.e., recurring coding patterns that can be leveraged to conduct adversarial attacks and need to be logged. A list of log-related coding patterns can guide practitioners on what to log while using supervised learning algorithms in software projects. We apply qualitative analysis on 3,004 Python files used to implement 103 supervised learning-based software projects. We identify a list of 54 log-related coding patterns that map to six attacks related to supervised learning algorithms. Using Log Assistant to conduct Postmortems for Supervised Learning (LOPSUL), we quantify the frequency of the identified log-related coding patterns in 278 open-source software projects that use supervised learning. We observe log-related coding patterns in 22% of the analyzed files, where training data forensics is the most frequently occurring category.
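The sketch below is not the paper's LOPSUL tool; it is a minimal, hedged illustration of the kind of "training data forensics" logging the abstract refers to: recording data provenance and model configuration before fitting a model so that a postmortem analysis can later check for tampering such as data poisoning. The CSV path, "label" column, and logger name are assumptions for illustration.

```python
# Illustrative log-related coding pattern: training data forensics.
import hashlib
import logging

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("training-forensics")  # hypothetical logger name


def train(csv_path: str) -> DecisionTreeClassifier:
    data = pd.read_csv(csv_path)

    # Record where the training data came from, its shape, and a content
    # digest, so later tampering (e.g., data poisoning) can be detected.
    digest = hashlib.sha256(
        pd.util.hash_pandas_object(data).values.tobytes()
    ).hexdigest()
    logger.info("training data: path=%s rows=%d cols=%d sha256=%s",
                csv_path, data.shape[0], data.shape[1], digest)

    X, y = data.drop(columns=["label"]), data["label"]  # assumes a 'label' column
    model = DecisionTreeClassifier(max_depth=5, random_state=0)
    model.fit(X, y)

    # Log the hyperparameters actually used to fit the model.
    logger.info("model trained: params=%s", model.get_params())
    return model
```

Persisting these log records alongside the trained model is what makes a postmortem diagnosis of a suspected attack feasible after the fact.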
-
Context: Supervised learning-based projects (SLPs), i.e., software projects that use supervised learning algorithms, such as decision trees, are useful for performing classification-related tasks. Yet, security weaknesses, such as the use of hard-coded passwords, can make SLPs susceptible to security attacks. A characterization of security weaknesses in SLPs can help practitioners understand the security weaknesses that are frequent in SLPs and adopt adequate mitigation strategies. Objective: The goal of this paper is to help practitioners securely develop supervised learning-based projects by conducting an empirical study of security weaknesses in supervised learning-based projects. Methodology: We conduct an empirical study by quantifying the frequency of security weaknesses in 278 open-source SLPs. Results: We identify 22 types of security weaknesses that occur in SLPs. We observe 'use of potentially dangerous function' to be the most frequently occurring security weakness in SLPs. Of the 3,964 identified security weaknesses, 23.79% and 40.49%, respectively, appear in source code files used to train and test models. We also observe evidence of co-location, e.g., instances of command injection co-locate with instances of potentially dangerous function. Conclusion: Based on our findings, we advocate a shift-left approach for SLP development, with security-focused code reviews and application of security static analysis.
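As a hedged illustration of two weakness classes named in the abstract, the snippet below contrasts a hard-coded password and a 'use of potentially dangerous function' (unpickling untrusted input) with common mitigations. The file names, credential values, and environment variable are hypothetical, not taken from the studied projects.

```python
# Illustrative security weaknesses in an SLP source file, with mitigations.
import json
import os
import pickle

# Weakness: hard-coded password embedded in source (flagged by static analyzers).
DB_PASSWORD = "s3cret"  # hypothetical value, shown only as the anti-pattern

# Mitigation: read the credential from the environment or a secrets manager.
db_password = os.environ.get("DB_PASSWORD")


def load_model_unsafe(path: str):
    # Weakness: 'use of potentially dangerous function' -- unpickling an
    # untrusted file can execute arbitrary code during deserialization.
    with open(path, "rb") as f:
        return pickle.load(f)


def load_config_safer(path: str) -> dict:
    # Mitigation: prefer a data-only format such as JSON for untrusted input.
    with open(path, "r") as f:
        return json.load(f)
```

Security static analysis tools can flag both patterns automatically, which is consistent with the shift-left recommendation in the conclusion.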
-
Machine learning (ML) operations or MLOps advocates for integration of DevOps- related practices into the ML development and deployment process. Adoption of MLOps can be hampered due to a lack of knowledge related to how development tasks can be automated. A characterization of bot usage in ML projects can help practitioners on the types of tasks that can be automated with bots, and apply that knowledge into their ML development and deployment process. To that end, we conduct a preliminary empirical study with 135 issues reported mined from 3 libraries related to deep learning: Keras, PyTorch, and Tensorflow. From our empirical study we observe 9 categories of tasks that are automated with bots. We conclude our work-in-progress paper by providing a list of lessons that we learned from our empirical study.more » « less
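The following minimal sketch is not from the paper; it illustrates one kind of development task that bots commonly automate in ML repositories: labeling newly reported, untriaged issues through the GitHub REST API. The repository name, label, and token handling are assumptions for illustration.

```python
# Hypothetical triage bot: label open, unlabeled issues in an ML repository.
import os

import requests

REPO = "example-org/example-ml-project"  # hypothetical repository
API = f"https://api.github.com/repos/{REPO}"
HEADERS = {
    "Authorization": f"token {os.environ['GITHUB_TOKEN']}",
    "Accept": "application/vnd.github+json",
}


def label_untriaged_issues(label: str = "needs-triage") -> None:
    issues = requests.get(f"{API}/issues", headers=HEADERS,
                          params={"state": "open"}, timeout=30).json()
    for issue in issues:
        if "pull_request" in issue:   # the issues endpoint also returns PRs
            continue
        if not issue["labels"]:       # only touch issues with no labels yet
            requests.post(f"{API}/issues/{issue['number']}/labels",
                          headers=HEADERS, json={"labels": [label]}, timeout=30)


if __name__ == "__main__":
    label_untriaged_issues()
```

Run on a schedule (e.g., from a CI job), a script like this behaves as a simple issue-triage bot of the kind the study categorizes.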
